POI Extraction from the Web: Store Name Recognition and Address Matching

Authors

  • Yu-Yang Lin
  • Chia-Hui Chang
Abstract

Mobility is one of the major trends of 2014. According to a report by IDC (International Data Corporation), worldwide shipments of tablets exceeded those of PCs in the fourth quarter of 2013, while smartphones had already surpassed other devices in unit shipments and market share. With this trend, many location-based services (LBS) have been proposed, for example, navigation and searching for restaurants or gas stations. Constructing a large POI (Point-of-Interest) database is therefore a key problem. In this paper, we address three problems: Taiwan address normalization, store name extraction, and the matching of addresses to store names. To train a statistical model for store name extraction, we use existing store-address pairs to prepare training data for sequence labeling. The model is trained using common characteristics of store names in addition to POS tags. When tested on search snippets, it achieves an F-measure of 0.791 for store name recognition.
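The idea of turning existing store-address pairs into sequence-labeling training data can be illustrated with a minimal sketch. The abstract does not specify the tagging scheme, the tokenization, or the feature set, so the BIO tags, the whitespace tokenizer, the helper name `bio_labels`, and the example snippet below are all assumptions made for illustration, not the authors' method.

```python
def bio_labels(tokens, store_tokens):
    """Tag each token B/I/O, marking every contiguous occurrence of store_tokens.

    Hypothetical helper: the BIO scheme is an assumption, not taken from the paper.
    """
    labels = ["O"] * len(tokens)
    n = len(store_tokens)
    if n == 0:
        return labels
    i = 0
    while i <= len(tokens) - n:
        if tokens[i:i + n] == store_tokens:
            labels[i] = "B"                      # first token of the store name
            for j in range(i + 1, i + n):
                labels[j] = "I"                  # remaining tokens of the store name
            i += n                               # skip past the matched name
        else:
            i += 1
    return labels


# Made-up snippet and store name, split on whitespace for simplicity.
snippet = "Visit Blue Moon Cafe at No. 12 Example Road".split()
store = "Blue Moon Cafe".split()
labels = bio_labels(snippet, store)

for token, label in zip(snippet, labels):
    print(f"{token}\t{label}")
```

Token-label pairs of this kind could then be combined with per-token features such as POS tags and passed to a sequence labeler; which model and features to use is left open here.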

Similar resources

應用興趣點辨識技術從 Web 中挖掘新商家資訊 (Mining POIs from Web via POI recognition and Relation Verification) [In Chinese]

This paper presents a system that can automatically extract new POIs from the Web. First, we use special queries (e.g. Taipei+New Open) to find Web pages that might contain addresses for new stores. For Web pages that contain addresses, we then apply a store name recognition model to extract possible POIs. Finally, we train a model to find the most probable POI for the address found in the page. In...

Verification of POI and Location Pairs via Weakly Labeled Web Data

With the increased popularity of mobile devices and smart phones, location-based services (LBS) have become a common need in our daily life. Therefore, maintaining the correctness of POI (Points of Interest) data has become an important issue for many location-based services such as Google Maps and Garmin navigation systems. The simplest form of POI contains a location (e.g., represented by an ...

POI Type Matching based on Culturally Different Datasets

The development of mobile social media networks has changed our daily life. More and more people tend to share their locations, emotions and activities with their friends, which are called ‘check-in records’. These geo-tagged data create an unprecedented opportunity for researchers to reveal spatio-temporal activity patterns of citizens [1,2], capture usages of urban public facilities [3], and ...

Vehicle Logo Recognition Using Image Matching and Textural Features

In recent years, automatic recognition of vehicle logos has become an important issue in modern cities. This is due to the unchecked growth in the number of cars and transportation systems, which makes it impossible for humans to fully manage and monitor them. In this research, an automatic real-time logo recognition system for moving cars is introduced based on histogram manipulation. In the proposed...

Journal title:
  • IJCLCLP

Volume 19  Issue 

Pages  -

Publication date 2014